The Big World Hypothesis and its Ramifications for Artificial Intelligence

  • 2025-09-28
  • Published: 2024-06-04
  • Authors: Khurram Javed, Richard S. Sutton

Abstract

The big world hypothesis says that for many learning problems, the world is multiple orders of magnitude larger than the agent. The agent neither fully perceives the state of the world nor can it learn the correct value or optimal action for each state. It has to rely on approximate solutions to achieve its goals. In this paper, we make a case for embracing the big world hypothesis. We argue that even as computational resources grow, the big world hypothesis remains relevant. We conclude by discussing the implications of accepting the big world hypothesis on the design and evaluation of algorithms.

openreview.net/pdf?id=Sv7DazuCn8

1. The Big World Hypothesis

There are many problems that satisfy the big world hypothesis and many that do not. The problem of finding the roots of a second-degree polynomial admits a simple solution that always works. Representing the value function of the game of Go for all states, by contrast, does not have a simple solution. The big world hypothesis is more a statement about the class of problems we should care about than a fact about all decision-making problems. It can be made true or false by exercising control over the design of the environment and the agent (e.g., when developing benchmarks).
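To make the contrast concrete, here is a minimal sketch (my own illustration, not from the paper): the quadratic formula solves every instance exactly with a handful of operations, whereas a tabular value function for Go is hopeless; the number of legal positions is roughly 2 × 10^170, a well-known estimate far beyond any conceivable memory.

```python
import cmath

def quadratic_roots(a: float, b: float, c: float):
    """Exact roots of a*x^2 + b*x + c = 0 (a != 0): a small solution that always works."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

print(quadratic_roots(1, -3, 2))  # ((2+0j), (1+0j))

# A tabular value function for Go, by contrast, is not representable:
# roughly 2 * 10**170 legal positions, at even one byte per value,
# dwarfs any memory we could ever build.
legal_go_positions = 2e170
print(f"bytes needed for a 1-byte-per-state table: {legal_go_positions:.1e}")
```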


Developing algorithms for big worlds poses unique challenges. The best algorithms for big worlds might prefer fast approximate solutions over slow exact ones.
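One small illustration of that trade-off (my own sketch, not an algorithm from the paper): an agent estimating a quantity from a long, drifting stream can recompute an exact mean over everything it has seen, or maintain a constant-memory exponential moving average that is only approximate but stays cheap and current.

```python
import random

def exact_mean(history):
    # Exact: stores every observation; cost grows without bound.
    return sum(history) / len(history)

class EMA:
    """Approximate: constant memory and O(1) work per observation."""
    def __init__(self, step_size: float = 0.05):
        self.step_size = step_size
        self.estimate = 0.0

    def update(self, x: float) -> float:
        self.estimate += self.step_size * (x - self.estimate)
        return self.estimate

ema, history = EMA(), []
for t in range(10_000):
    target = 1.0 if t < 5_000 else 3.0        # the world drifts halfway through
    x = target + random.gauss(0.0, 0.5)
    history.append(x)
    ema.update(x)

print(f"exact mean over all data: {exact_mean(history):.2f}")   # blurred by stale data
print(f"EMA estimate:             {ema.estimate:.2f}")          # tracks the recent world
```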

2. Reconciling the Big World Hypothesis and Exponentially Growing Computation

Two problems remain even as compute grows. First, as compute increases, so does the amount of data fed to the agent through its sensors; yet however high the sensors' resolution (spatial resolution by temporal resolution) becomes, it falls far short of conveying the state of the world in full. Second, as compute grows, the world itself becomes more complex.

First, it is not just our agents that are constrained by compute. The sensors used by our agents are also constrained by compute. …

The second problem with waiting for compute to grow is that as compute becomes more readily available, the world itself becomes more complex.

The big world hypothesis is therefore not a hypothesis about temporary constraints:

As computational resources increase so does the complexity of the world. The big world hypothesis is not a temporary artifact of limitations of our current computers. For many problems, the world will always be much larger than any single agent.

3. Existing Evidence Consistent with the Big World Hypothesis

(…omitted…)

4. Ramifications of the Big World Hypothesis on Algorithm Design

Online continual learning is an important solution method for achieving goals in big worlds.

Computationally efficient learning algorithms can be advantageous in big worlds.
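A minimal sketch of what online, computationally efficient learning can look like (my own illustration, not the paper's algorithm): TD(0) with a fixed-size linear value function updates from each observation as it arrives, using memory and per-step compute that stay constant no matter how large the world is.

```python
import numpy as np

class OnlineTDZero:
    """TD(0) with a fixed-size linear value function: O(d) memory and O(d) work per step."""
    def __init__(self, num_features: int, step_size: float = 0.1, discount: float = 0.99):
        self.w = np.zeros(num_features)
        self.step_size = step_size
        self.discount = discount

    def value(self, features: np.ndarray) -> float:
        return float(self.w @ features)

    def update(self, features, reward, next_features) -> float:
        # One constant-time update per time step; no replay buffer, no growing state table.
        td_error = reward + self.discount * self.value(next_features) - self.value(features)
        self.w += self.step_size * td_error * features
        return td_error

# Usage on a synthetic stream: the agent never sees the world's full state,
# only a d-dimensional feature vector derived from it.
rng = np.random.default_rng(0)
agent = OnlineTDZero(num_features=8)
phi = rng.random(8)
for t in range(1_000):
    phi_next = rng.random(8)
    reward = 0.1 * phi.sum() + rng.normal(0, 0.01)
    agent.update(phi, reward, phi_next)
    phi = phi_next
print("learned weights:", np.round(agent.w, 2))
```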

Making progress on big world problems requires a different approach for evaluating algorithms.

One way to evaluate algorithms for big worlds is to test them on complex environments so that even our largest agents on the latest hardware are not over-parameterized. While this approach has merit, it makes it difficult to do careful and reproducible experiments.

The alternative is to restrict the computational capabilities of the agents instead of making the environments larger. The primary limitation of restricting agents is that we might miss out on emergent properties of large agents. However, a small agent learning in a non-trivial environment is still a better proxy for learning in big worlds than a large over-parameterized agent learning in the same environment. …

Restricting the computational capabilities of the agents is not trivial. There is no consensus on what aspects of the agents should be restricted. We could restrict the number of operations, the amount of memory, the amount of memory bandwidth, or the amount of energy the agent can use. The choice of constraints can have a significant impact on the performance of the agent.

One option is to match the constraints on the agent with the constraints imposed by current hardware. For example, if memory is cheaper than CPU cycles, we might want to restrict the CPU cycles. Alternatively, if accessing the memory is a bottleneck, we might want to restrict the memory bandwidth.
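One way such constraints might be operationalized in a benchmark (a sketch under my own assumptions; the paper does not prescribe a protocol) is to fix a per-step budget on parameters and operations and reject agent configurations that exceed it.

```python
from dataclasses import dataclass

@dataclass
class ComputeBudget:
    max_parameters: int      # bound on memory (weights the agent may store)
    max_macs_per_step: int   # bound on per-step operations (multiply-accumulates)

def check_linear_agent(num_features: int, num_actions: int, budget: ComputeBudget) -> bool:
    """Checks a simple linear agent (one weight matrix) against the budget."""
    parameters = num_features * num_actions
    macs_per_step = num_features * num_actions   # one matrix-vector product per decision
    ok = parameters <= budget.max_parameters and macs_per_step <= budget.max_macs_per_step
    print(f"params={parameters}, MACs/step={macs_per_step}, within budget: {ok}")
    return ok

budget = ComputeBudget(max_parameters=100_000, max_macs_per_step=100_000)
check_linear_agent(num_features=1_000, num_actions=10, budget=budget)   # fits
check_linear_agent(num_features=50_000, num_actions=10, budget=budget)  # rejected
```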

A second option is to limit energy usage. Energy is a universal constraint that can take into account the evolution of hardware over time and can even drive research for designing better hardware for our agents. The downside of using energy as a constraint is that it is difficult to measure. Normally, the computer running the agent is also running the environment, an operating system, and other unrelated processes. Isolating the energy used by the agent from background tasks is challenging.
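To see why isolation is hard, here is a hedged sketch that reads the Linux powercap (Intel RAPL) energy counter; the path, availability, and required permissions vary by machine, and the counter covers the whole CPU package, so the environment, OS, and background processes are all included in the reading.

```python
from pathlib import Path

# Cumulative package energy in microjoules on Linux machines exposing Intel RAPL.
# The exact path is an assumption and differs across systems; not a portable API.
RAPL_ENERGY_FILE = Path("/sys/class/powercap/intel-rapl:0/energy_uj")

def read_energy_uj() -> int:
    return int(RAPL_ENERGY_FILE.read_text())

def measure_joules(workload) -> float:
    """Rough package-level energy for a workload; includes all co-running processes."""
    before = read_energy_uj()
    workload()
    after = read_energy_uj()
    # The counter wraps around periodically; a robust harness would handle that.
    return (after - before) / 1e6

if RAPL_ENERGY_FILE.exists():
    joules = measure_joules(lambda: sum(i * i for i in range(10_000_000)))
    print(f"approximate package energy: {joules:.2f} J")
else:
    print("RAPL counter not available on this machine")
```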

5. Conclusions

The big world hypothesis has direct implications for what we choose to study and how we evaluate our algorithms. It is not a temporary artifact of current limitations of our computers. It is imperative that we develop algorithms that allow agents to achieve goals in big worlds. This requires developing computationally efficient algorithms for learning continually. It also requires rethinking the way we benchmark our algorithms.